Interpretable machine learning models for detecting peripheral neuropathy and lower extremity arterial disease in diabetics: an analysis of critical shared and unique risk factors

Background Diabetic peripheral neuropathy (DPN) and lower extremity arterial disease (LEAD) are significant contributors to diabetic foot ulcers (DFUs), which severely affect patients' quality of life. This study aimed to develop machine learning (ML) predictive models for DPN and LEAD and to identify both shared and distinct risk factors.
Methods This retrospective study included 479 diabetic inpatients, of whom 215 were diagnosed with DPN and 69 with LEAD. Clinical data and laboratory results were collected for each patient. Feature selection was performed using three methods: mutual information (MI), random forest recursive feature elimination (RF-RFE), and the Boruta algorithm to identify the most important features. Predictive models were developed using logistic regression (LR), random forest (RF), and eXtreme Gradient Boosting (XGBoost), with particle swarm optimization (PSO) used to optimize their hyperparameters. The SHapley Additive exPlanation (SHAP) method was applied to determine the importance of risk factors in the top-performing models.
Results For diagnosing DPN, the XGBoost model was most effective, achieving a recall of 83.7%, specificity of 86.8%, accuracy of 85.4%, and an F1 score of 83.7%. On the other hand, the RF model excelled in diagnosing LEAD, with a recall of 85.7%, specificity of 92.9%, accuracy of 91.9%, and an F1 score of 82.8%. SHAP analysis revealed the top five critical risk factors shared by DPN and LEAD, including increased urinary albumin-to-creatinine ratio (UACR), glycosylated hemoglobin (HbA1c), serum creatinine (Scr), older age, and carotid stenosis. Additionally, distinct risk factors were pinpointed: decreased serum albumin and lower lymphocyte count were linked to DPN, while elevated neutrophil-to-lymphocyte ratio (NLR) and higher D-dimer levels were associated with LEAD.
Conclusions This study demonstrated the effectiveness of ML models in predicting DPN and LEAD in diabetic patients and identified significant risk factors. Focusing on shared risk factors may greatly reduce the prevalence of both conditions, thereby mitigating the risk of developing DFUs.
Supplementary Information The online version contains supplementary material available at 10.1186/s12911-024-02595-z.


Introduction
Diabetes has become a growing global health concern, affecting around 451 million people in 2017, with projections to rise to 693 million by 2045 [1]. In China, the prevalence of diabetes has surged from less than 1% in the 1980s to approximately 10.9% in 2013 and 12.4% in 2018, making it the country with the world's largest diabetic population [2]. Diabetic peripheral neuropathy (DPN) and lower extremity arterial disease (LEAD) are prevalent complications of diabetes, with occurrence rates of about 50% and 3-20%, respectively [3-6]. Both complications serve as extrinsic risk factors for diabetic foot ulcers (DFUs) [7], leading to higher rates of amputation, increased mortality, and substantial economic burdens for patients with diabetes [8]. Unfortunately, patients with DPN or LEAD may be asymptomatic in their early stages, and many patients already have these complications at the time of initial diagnosis [8,9]. Therefore, early identification and management of DPN and LEAD are crucial in preventing DFUs among diabetic patients.
At present, the diagnosis of DPN and LEAD mainly relies on physical examination of the peripheral nervous system, electromyography (EMG), ankle-brachial index, lower limb vascular ultrasound, etc. [5,10]. However, these methods require well-trained endocrinologists and specialized diagnostic equipment, which are often scarce in underdeveloped regions. To address this challenge, researchers are actively exploring the development of practical, accessible, and cost-effective clinical diagnosis models for DPN and LEAD based on clinical features and routinely measured lab parameters. Recent studies have demonstrated that machine learning (ML) models, by utilizing medical history, physical examinations, and basic lab tests, could effectively predict DPN and LEAD [4,11]. Moreover, ML algorithms achieved high accuracy in identifying DPN through the analysis of immune biomarkers or microcirculatory parameters [12,13]. Additionally, a model based on support vector machine (SVM) has been shown to accurately predict DPN severity in about 76% of cases, utilizing general patient information and responses from a neuropathy disability score questionnaire [14]. Despite these advances, most studies have concentrated on developing one model for either DPN or LEAD, without considering common risk factors for both conditions in a single study. Given the high prevalence of DPN and LEAD in developing countries and their potential to lead to DFUs, which can significantly increase mortality rates [15,16], it is crucial to identify and target shared risk factors for both conditions. This strategy could facilitate early and concurrent interventions, potentially diminishing the prevalence and severity of these diseases.
In this study, we employed logistic regression (LR), random forest (RF), and eXtreme Gradient Boosting (XGBoost) to develop diagnostic models for both DPN and LEAD among diabetic individuals, utilizing demographic, clinical, and laboratory information. This research spanned the interdisciplinary fields of medicine, biostatistics, and ML. To identify shared and unique risk factors for DPN and LEAD, we used the SHapley Additive exPlanation (SHAP) method to prioritize risk factors within the most effective models for each condition.

Contributions of this work
The major contributions of this study are outlined as follows: (1) We constructed ML models for DPN and LEAD detection based on accessible demographic, clinical, and laboratory data, minimizing the need for specialized tests and advanced medical facilities. This approach is especially beneficial for areas with limited healthcare resources. (2) We utilized three feature selection methods (mutual information (MI), random forest recursive feature elimination (RF-RFE), and the Boruta algorithm) to identify the most significant features. This strategy helps reduce overfitting and enhances the robustness of our models. (3) To optimize the performance of each ML model, we applied particle swarm optimization (PSO) to fine-tune hyperparameters. (4) The SHAP method was applied to elucidate the contribution of each feature to the risk of developing DPN and LEAD in the best-performing models. This analysis identified both shared and distinct risk factors for DPN and LEAD, deepening our insight into their pathophysiological foundations. Concentrating on shared risk factors may significantly reduce the prevalence of these conditions and subsequently the risk of DFUs.

Study design and participants
This is a cross-sectional study conducted at Tongji Hospital from January 2022 to March 2023. We collected clinical characteristics and laboratory data of 712 diabetic inpatients who underwent EMG and lower limb vascular ultrasound examinations. Patients were excluded from the study if they had diabetic ketoacidosis (DKA), hyperosmolar hyperglycemic syndrome (HHS), autoimmune diseases, infectious diseases, malignant tumors, or if more than 30% of their data was missing. Ultimately, 479 diabetic patients were enrolled in this study. This research was performed in compliance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and received approval from the Institutional Ethics Committee (K-2023-022).

Diagnostic criteria
All diabetic patients enrolled in our study underwent screening for DPN and LEAD during hospitalization. The diagnosis of diabetic complications was made by two qualified endocrinologists based on the locally recognized criteria [17]. A Dantec [...], urinary microalbumin and creatinine, and the urinary albumin-to-creatinine ratio (UACR). We calculated the estimated glomerular filtration rate (eGFR) using a formula provided in a previous study [18]. In addition, we collected the results of carotid artery ultrasound. Bilateral common carotid artery, internal carotid artery, and external carotid artery were examined using a Philips IU22 Doppler ultrasonic color imaging system (Philips, USA) equipped with a 3-D array probe (7-12). The dataset, encompassing all features and their respective values, is detailed in Supplementary Table S1.

Missing data
Comprehensive demographic information was available for all participants, as each patient underwent the hospitalization process. Missing data for laboratory parameters were below 30%. We addressed these missing values using the most recent available measurements. Any remaining missing values were imputed using the median.
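The median-imputation step can be illustrated with a minimal sketch. It uses only the Python standard library; the column names and values are hypothetical and are not taken from the study dataset, and the earlier step of carrying forward the most recent measurement is not shown.

```python
from statistics import median

def impute_median(rows, columns):
    """Return a copy of rows where None entries are replaced by the column median."""
    filled = [dict(r) for r in rows]  # copy so the input rows are left untouched
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        if not observed:
            continue  # nothing to estimate the median from
        col_median = median(observed)
        for r in filled:
            if r[col] is None:
                r[col] = col_median
    return filled

# Hypothetical laboratory values, for illustration only.
patients = [
    {"hba1c": 7.2, "scr": 68.0},
    {"hba1c": None, "scr": 91.0},
    {"hba1c": 9.1, "scr": None},
    {"hba1c": 8.0, "scr": 75.0},
]
completed = impute_median(patients, ["hba1c", "scr"])
```

In a train/test workflow the median would usually be estimated on the training split only and then applied to the test split; the text does not state which variant was used here.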

Data balancing
In constructing models for LEAD, we encountered a class imbalance issue due to the low proportion of patients with LEAD in the overall population. To address this problem, we used the imbalanced-learn package in Python to apply random undersampling, achieving a 1:3 ratio between the LEAD and non-LEAD groups [19]. This reduced the number of non-LEAD cases to three times the number of LEAD cases, mitigating the imbalance and improving the reliability of our models.
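As an illustration of the 1:3 undersampling, the sketch below reproduces the idea with the Python standard library rather than with imbalanced-learn (whose `RandomUnderSampler` provides the same operation). The function name, the seed, and the label vector are assumptions made for the example; only the class counts (69 LEAD cases among 479 patients) come from the text.

```python
import random

def undersample(labels, ratio=3, seed=0):
    """Keep every minority case (label 1) and a random sample of majority cases
    (label 0) so that majority:minority is at most ratio:1. Returns kept indices."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(majority), ratio * len(minority))
    kept_majority = rng.sample(majority, n_keep)  # sampling without replacement
    return sorted(minority + kept_majority)

# 69 LEAD cases and 410 non-LEAD cases, matching the cohort of 479 patients.
labels = [1] * 69 + [0] * 410
kept_indices = undersample(labels)
n_lead = sum(labels[i] for i in kept_indices)
n_non_lead = len(kept_indices) - n_lead
```

With these counts the procedure retains 69 LEAD and 207 non-LEAD cases, the sample sizes reported for the LEAD models.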

Feature selection strategy
The feature selection process was conducted using three distinct methods: MI, RF-RFE, and the Boruta algorithm. MI quantifies the dependency between variables by capturing all types of relationships, both linear and nonlinear. For feature selection, MI assesses the dependency of each feature on the target label to identify the most informative features for prediction [20]. RF-RFE utilizes an RF to iteratively build models, systematically removing the least important features in each round. This method emphasizes features that significantly affect model performance [21]. The Boruta algorithm employs an RF classifier to evaluate features against their randomized "shadow" versions, ensuring only essential features are retained for accurate model predictions [22].
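To make the MI criterion concrete, the following sketch computes the mutual information between a discrete feature and a binary label directly from its definition, I(X;Y) = Σ p(x,y) log[p(x,y) / (p(x)p(y))]. It is an illustration only: the toy vectors are invented, and continuous laboratory values would need discretization or a nearest-neighbor estimator (such as the one behind scikit-learn's `mutual_info_classif`), which is not shown.

```python
from collections import Counter
from math import log

def mutual_information(xs, ys):
    """Mutual information (in nats) between two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # (count/n) * log( (count/n) / ((px/n) * (py/n)) )
        mi += (count / n) * log(count * n / (px[x] * py[y]))
    return mi

# Toy example: one feature copies the label, the other is independent of it.
label       = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
noise       = [0, 1, 0, 1]

mi_informative = mutual_information(informative, label)  # equals ln 2
mi_noise = mutual_information(noise, label)              # equals 0
```

Ranking features by this score and keeping the highest-scoring ones corresponds to the MI-based selection described above.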
For both MI and RF-RFE, the top 15 features were identified independently. The Boruta algorithm categorized features as confirmed, tentative, or rejected, selecting features that were either confirmed or tentative. Only features chosen by at least two of these three methods were used to develop ML models. This consensus rule is intended to reduce redundancy and improve the predictive accuracy of the ML models.
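The consensus rule (retain a feature when at least two of the three methods select it) amounts to a simple vote count. The sketch below assumes each method has already produced its list of selected features; the feature lists are hypothetical and do not reproduce Supplementary Tables S3 and S4.

```python
from collections import Counter

def consensus_features(selections, min_votes=2):
    """Return the features chosen by at least min_votes of the selection methods."""
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, v in votes.items() if v >= min_votes)

# Hypothetical per-method selections, for illustration only.
selections = {
    "MI":     ["UACR", "HbA1c", "age", "Scr"],
    "RF-RFE": ["UACR", "HbA1c", "albumin"],
    "Boruta": ["UACR", "age", "albumin", "FBG"],
}
selected = consensus_features(selections)
unanimous = consensus_features(selections, min_votes=3)
```

Setting `min_votes=3` gives the unanimously selected subset, which the Results report separately from the features chosen by two methods.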

ML model construction and interpretation
The models were constructed and interpreted using Python (version 3.9.6, Python Software Foundation, USA). The workflow for constructing and interpreting ML models is illustrated in Fig. 1. First, the dataset was randomly divided into two subsets: 80% designated for training the model and the remaining 20% reserved for testing. Then, three distinct ML models (LR, RF, and XGBoost) were developed to predict DPN and LEAD based on selected features. To optimize these models and select the most suitable hyperparameters, we employed PSO. PSO is a computational method that mimics a swarm of particles navigating through the parameter space to find optimal solutions. In ML, PSO enhances model parameterization by representing each particle as a potential solution that is continuously refined through both individual and collective experiences within the swarm. This strategy efficiently identifies the best parameter combinations, significantly improving model performance [23]. The hyperparameter settings and their optimal values are shown in Supplementary Table S2.
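The PSO update described above can be sketched in a few lines. This is a generic global-best PSO written with the standard library, not the authors' implementation: the inertia and acceleration coefficients, the swarm size, the search bounds, and the quadratic `toy_validation_loss` (a stand-in for a cross-validated model loss over two hyperparameters) are all assumptions made for illustration.

```python
import random

def pso_minimize(loss, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c_personal=1.5, c_social=1.5, seed=0):
    """Global-best particle swarm optimization over a box-constrained space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # each particle's best position so far
    best_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clip it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

def toy_validation_loss(params):
    """Hypothetical loss surface with its minimum at learning_rate=0.1, max_depth=6."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6.0) ** 2

best_params, best_loss = pso_minimize(toy_validation_loss, [(0.01, 0.5), (2.0, 12.0)])
```

In an actual tuning run the loss would be a cross-validated error of the model trained with the candidate hyperparameters, and integer-valued hyperparameters such as tree depth would be rounded before fitting.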
The effectiveness of each model was evaluated using various metrics, including recall, specificity, precision, accuracy, and the F1 score. We also calculated the area under the receiver operating characteristic curve (AUC) for the test sets to assess the performance of each model. Furthermore, the SHAP method was employed to interpret the contribution of each predictor within the optimal models. Through SHAP analysis, we gained a detailed understanding of how each feature influences the model's output, providing a comprehensive insight into the model's decision-making process [24].
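The quantity that SHAP attributes to each feature is a Shapley value. For a model with only a few inputs it can be computed exactly by enumerating feature subsets, as sketched below. The `toy_risk` function, the input values, and the all-zero baseline are hypothetical; practical SHAP analyses of tree models rely on efficient approximations from the `shap` package rather than on this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

def toy_risk(v):
    """Hypothetical risk score with one interaction between features 0 and 2."""
    return 0.5 * v[0] + 0.3 * v[1] + 0.2 * v[0] * v[2]

x = [2.0, 1.0, 1.0]          # illustrative standardized inputs
baseline = [0.0, 0.0, 0.0]   # reference point
phi = shapley_values(toy_risk, x, baseline)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split equally between the two features involved; ranking features by mean absolute attribution across patients gives importance plots of the kind shown in Fig. 5.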

Statistical analysis
The statistical analyses were performed using SPSS (version 27.0, IBM, USA). For data adhering to a normal distribution, values were depicted as mean ± standard deviation. Differences among these values were examined using the independent Student's t-test. Conversely, for data not following a normal distribution, variables were presented as medians (interquartile range, IQR), and the Mann-Whitney U test was employed to evaluate disparities in their distributions. Categorical data were represented as n (%) and analyzed for distribution differences via the Chi-square (χ²) test or Fisher's exact test, as appropriate. A p-value < 0.05 was considered statistically significant.

Clinical features of patients
In this study, we initially enrolled 712 diabetic inpatients. After applying exclusion criteria, 479 patients qualified for inclusion. Among them, 215 were diagnosed with DPN, and 69 with LEAD. The median age of participants was 50 years (IQR: 48-56), and the male-to-female ratio was 0.58. All these cases were utilized to develop models for diagnosing DPN. To correct for the imbalance in sample sizes, a one-to-three random undersampling strategy was employed for LEAD cases versus non-LEAD controls, resulting in 69 LEAD cases and 207 non-LEAD cases being selected to construct LEAD prediction models (Fig. 2). According to univariate analysis, out of the 38 features, 24 exhibited significant discrepancies between patients with and without DPN, whereas 17 features displayed differences between those with and without LEAD (Tables 1 and 2). Patients with DPN or LEAD were found to be older compared to those without these complications. Men were more likely to develop DPN or LEAD than women. Moreover, patients with DPN and LEAD exhibited increased levels of HbA1c, serum urea nitrogen (SUN), Scr, fasting blood glucose (FBG), D-dimer, NLR, urinary microalbumin, and UACR compared to those without these complications. In contrast, the levels of serum albumin, eGFR, total cholesterol (TC), and low-density lipoprotein (LDL) were found to be lower in patients with DPN and LEAD. Additionally, a positive association was observed between the presence of carotid stenosis and the occurrence of DPN and LEAD.

Fig. 1 ML model development and evaluation process

Selected features
For DPN, a consensus was reached on eight features selected by all three feature selection methods. An additional four features were agreed upon by two of the methods, resulting in a total of 12 distinct features that were incorporated into the models, as outlined in Supplementary Table S3. Similarly, for LEAD, unanimous selection was achieved for five features across all methods, with another four features chosen by two of the methods. Thus, a total of nine features were integrated into the ML models, as detailed in Supplementary Table S4.

Diagnostic performance of LR, RF, and XGBoost in detecting DPN
The diagnostic performances of the three models for detecting DPN are shown in Figs. 3A and 4A. Among these models, XGBoost demonstrated the highest diagnostic efficacy, achieving an AUC of 0.903, a recall of 83.7%, a specificity of 86.8%, an accuracy of 85.4%, a precision of 83.7%, and an F1 score of 83.7%. Additionally, RF showed the highest specificity, at 90.6%.
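The relationship between these metrics can be checked from a confusion matrix. The counts below are an assumption: the confusion matrix is not given in the text, and these values are simply one set of counts for a 96-patient test split (20% of 479) that reproduces the percentages reported for the XGBoost DPN model.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)            # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }

# Assumed counts (not reported in the paper): 43 DPN and 53 non-DPN test cases.
dpn = diagnostic_metrics(tp=36, fn=7, tn=46, fp=7)
dpn_percent = {name: round(100 * value, 1) for name, value in dpn.items()}
```

Because false positives and false negatives are equal under these assumed counts, precision, recall, and the F1 score coincide, which matches the identical 83.7% values reported for the three metrics.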

Diagnostic performance of LR, RF, and XGBoost in detecting LEAD
The performances of the LEAD models are presented in Figs. 3B and 4B. The RF model outperformed the others with the highest AUC of 0.923, recall of 85.7%, specificity of 92.9%, accuracy of 91.9%, precision of 80.0%, and an F1 score of 82.8%, followed by XGBoost and LR.

Critical shared and unique risk factors for DPN and LEAD through SHAP analysis
SHAP was applied to evaluate the importance of features within the optimal ML models for DPN and LEAD, yielding a prioritized list that illustrates their respective impacts. Figure 5A and B present the rankings of critical features in the XGBoost model for DPN and the RF model for LEAD, respectively. This analysis highlighted the importance of both shared and unique risk factors. Common risk factors identified for both conditions include increased UACR, elevated HbA1c, elevated Scr, advanced age, carotid stenosis, high FBG, and reduced eGFR. Unique to DPN were decreased serum albumin and lower lymphocyte count, whereas LEAD was specifically associated with increased NLR and higher D-dimer levels.

Discussion
This study constructed three different ML models for predicting DPN and LEAD among diabetic patients, utilizing basic clinical and laboratory data. We discovered that the XGBoost model demonstrated superior diagnostic performance in detecting DPN, whereas the RF model excelled in identifying LEAD. Furthermore, SHAP analysis identified the top five important risk factors common to both conditions: elevated UACR, HbA1c, Scr, advanced age, and carotid stenosis. Additionally, it pinpointed unique risk factors for each condition: decreases in serum albumin and lymphocyte count were significant for DPN, while increased NLR and D-dimer were key indicators for LEAD.
ML models are significantly advancing the field of medical diagnostics. Recent advancements in predicting DPN and LEAD are summarized in Table 3. For DPN detection, Metsker et al. [25] developed ML models using age, gender, and 27 laboratory tests. Among these models, the artificial neural network (ANN) achieved the highest recall at 0.809, the LR had the highest precision at 0.683, while Linear Regression displayed both the highest F1 score at 0.730 and the highest accuracy at 0.747. Another study demonstrated that, using demographic, clinical, and laboratory data, both RF and SVM models significantly distinguished DPN in individuals with type 2 diabetes mellitus (T2DM). The accuracy, sensitivity, and specificity were 67.8%, 68.09%, and 67.44% for RF, and 67.8%, 68.89%, and 66.67% for SVM, respectively [26]. By contrast, our study showed that the XGBoost model had the highest diagnostic performance, with an accuracy of 85.4%, a sensitivity of 83.7%, and a specificity of 86.8%, which were much higher than those reported in previous research. For LEAD, our RF model showed superior performance, aligning with previous findings that highlighted the RF model's enhanced predictive capabilities over the LR model [4]. Of note, the improved performance in previous studies was attributed to the inclusion of the ankle-brachial pressure index (ABI), a common indicator for diagnosing LEAD. Our study, however, relied solely on clinical data and routine laboratory tests to construct ML models. The remarkable performance of our models can be attributed to our methods of feature selection and hyperparameter optimization. We combined three different methods (MI, RF-RFE, and the Boruta algorithm) to identify the most significant features. This approach helps reduce overfitting and enhances the robustness of our models. Besides, PSO was applied to optimize hyperparameters. Unlike traditional methods such as grid search, PSO does not rely on a predefined grid of parameter values and step sizes, making it particularly effective for complex optimization challenges with large parameter spaces.
SHAP, a game theory-based method, was used in this study to identify key risk factors for DPN and LEAD. The analysis revealed that the primary risk factors common to both conditions were increased UACR, HbA1c, Scr, advanced age, and carotid stenosis. Notably, UACR was ranked as the most crucial predictor for DPN and the third most significant for LEAD. This finding was consistent with a large retrospective cohort study that identified UACR as a crucial predictor for DPN [27]. Additionally, a 30% or greater increase in UACR was reported to be a risk factor for the onset of DPN [28]. Previous studies also discovered that UACR served as a biomarker for the early detection of LEAD [29] and a risk factor for mortality in LEAD patients [30]. This underscored the critical importance of regular UACR monitoring to prevent DPN and LEAD, thereby potentially reducing the risk of DFUs.
As expected, HbA1c and older age were critical shared risk factors for both conditions, aligning with previous studies [27,31-33]. Unlike FBG, which can fluctuate significantly due to various factors, HbA1c provides a more stable measure of blood glucose levels over the preceding three months. Chronic hyperglycemia in diabetes contributed to the development of DPN through mechanisms such as increased oxidative stress and inflammation [34]. These processes disrupt blood flow to peripheral nerves and impair nerve function. Chronic hyperglycemia can also cause damage to endothelial cells and thicken the intima-media layer in blood vessels, particularly in the lower extremities [35]. Additionally, as individuals age, the key components of the extracellular matrix, particularly elastic fibers, are subjected to degradation and fragmentation. Age-related increases in cross-linking between collagen fibers could further contribute to the development of arterial stiffness [36], which may diminish blood flow to nerves and affect their repair capabilities, potentially increasing the prevalence of DPN [37]. These findings underscored the importance of maintaining good glycemic control, especially in older patients.
Scr is a key marker for kidney function, with elevated levels often indicating renal damage. Impaired kidney function can affect the microcirculation in distant organs [38], potentially compromising blood flow to peripheral nerves and arteries, which increases the risk of DPN and LEAD. Moreover, carotid stenosis was also recognized as a significant risk factor for both conditions. While the direct link between carotid stenosis and DPN is less studied, recent findings suggested that carotid atherosclerosis, the primary cause of carotid stenosis, could independently predict small fiber nerve dysfunction in individuals with T2DM [39]. Furthermore, a cross-sectional study of 653 patients with LEAD found that 415 (63.5%) had carotid stenosis [40], implying that carotid stenosis may be a contributing risk factor for LEAD. Therefore, diabetic patients should also pay more attention to kidney function and neck vascular health to reduce the prevalence of DPN and LEAD.
Furthermore, unique risk factors were also identified. For DPN, decreased serum albumin was a critical predictor. Among patients with T2DM, a serum albumin level below 36.75 g/L was independently associated with impaired peripheral nerve function, with a sensitivity of 65.6% and a specificity of 78.0% for detecting abnormal function in those with albuminuria [41]. Recent studies further supported the inverse relationship between serum albumin levels and the prevalence of DPN among T2DM patients [42,43]. These findings suggest that serum albumin may play a protective role against the development of DPN, potentially due to its antioxidant, anti-inflammatory, and anti-atherosclerotic properties [42]. Another unique but often overlooked risk factor for DPN was lymphocyte count. Both serum albumin and lymphocyte count are indicators of nutritional status [44], highlighting the importance for patients with DPN to closely monitor and manage their nutrition. For LEAD, elevated NLR was identified as a unique key risk factor, consistent with previous studies [45,46], which discovered that NLR was positively related to the prevalence of LEAD. D-dimer was identified as another crucial predictor for LEAD. In a prospective cohort study, patients with LEAD had significantly higher levels of D-dimer than those without LEAD [47]. In addition, the levels of D-dimer were observed to increase with the severity of LEAD [48]. Elevated D-dimer levels may reflect the extent of atherosclerosis, as they indicate ongoing fibrin formation and degradation [49].

Conclusion
Our study underscored the potential of ML models in predicting DPN and LEAD in diabetic patients. We found that XGBoost showed superior performance in identifying DPN, whereas the RF model was more effective for diagnosing LEAD. SHAP analysis revealed the top five most critical risk factors common to both conditions, including elevated UACR, HbA1c, Scr, advanced age, and carotid stenosis. Additionally, unique predictors were identified for each condition: decreased serum albumin and lymphocyte count were associated with DPN, whereas increased NLR and D-dimer levels were linked to LEAD. These insights underscored the complexity of managing DPN and LEAD, emphasizing the need for personalized and comprehensive treatment strategies. Implementing these insights could enhance early detection and management of these diabetic complications, which would be particularly beneficial in regions with limited medical resources. Prioritizing the management of shared risk factors, like glycemic control, renal function, and macrovascular health, may reduce the frequency of DPN and LEAD, thereby decreasing the risk of DFUs. Patients with DPN should also focus on maintaining good nutritional health. Future research should include a broader and more diverse population and investigate the feasibility of developing a unified ML model capable of predicting both DPN and LEAD in individuals with diabetes.

Fig. 3 ROC curves of LR, RF, and XGBoost models for detecting DPN and LEAD. A ROC curves for DPN; B ROC curves for LEAD. ROC, receiver operating characteristic. LR, logistic regression. RF, random forest. DPN, diabetic peripheral neuropathy. LEAD, lower extremity arterial disease

Fig. 5 Feature importance of SHAP values for XGBoost model in detecting DPN and for RF model in detecting LEAD. A SHAP values of XGBoost model in detecting DPN; B SHAP values of RF model in detecting LEAD. SHAP, SHapley Additive exPlanation. RF, random forest. DPN, diabetic peripheral neuropathy. LEAD, lower extremity arterial disease

Table 3 Comparative analysis of the proposed work with previous studies for DPN and LEAD prediction models
RF excelled, with an accuracy of 91.9%, a recall of 85.7%, a specificity of 92.9%, a precision of 80.0%, and an AUC of 0.923
DPN diabetic peripheral neuropathy, LEAD lower extremity arterial disease, LR logistic regression, ANN artificial neural network, SVM support vector machine, RF random forest, XGBoost eXtreme Gradient Boosting